Heterogeneous ensemble selection for evolving data streams

Abstract

Ensemble learning has been widely applied to both batch data classification and streaming data classification. For the latter setting, most existing ensemble systems are homogeneous, which means they are generated from only one type of learning model. In contrast, by combining several different types of learning models, a heterogeneous ensemble system can achieve greater diversity among its members, which helps to improve its performance. Although heterogeneous ensemble systems have achieved many successes in the batch classification setting, it is not trivial to extend them directly to the data stream setting. In this study, we propose a novel HEterogeneous Ensemble Selection (HEES) method, which dynamically selects an appropriate subset of base classifiers to predict data under the stream setting. We are inspired by the observation that a well-chosen subset of good base classifiers may outperform the whole ensemble system. Here, we define a good candidate as one that expresses not only high predictive performance but also high confidence in its prediction. Our selection process is thus divided into two sub-processes: accurate-candidate selection and confident-candidate selection. We define an accurate candidate in the stream context as a base classifier with high accuracy over the current concept, while a confident candidate is one with a confidence score higher than a certain threshold. In the first sub-process, we employ the prequential accuracy to estimate the performance of a base classifier at a specific time, while in the latter sub-process, we propose a new measure to quantify the predictive confidence and provide a method to learn the threshold incrementally. The final ensemble is formed by taking the intersection of the sets of confident classifiers and accurate classifiers. Experiments on a wide range of data streams show that the proposed method achieves competitive performance with lower running time in comparison to the state-of-the-art online ensemble methods.
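
The two-stage selection described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the accuracy criterion, the confidence measure, or the incremental threshold update, so the sketch substitutes simple stand-ins. The names `PrequentialAccuracy` and `select_ensemble`, the "at or above mean accuracy" rule, the fixed threshold, and the fallback for an empty intersection are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PrequentialAccuracy:
    """Test-then-train accuracy: each prediction is scored before its label is used."""
    correct: int = 0
    seen: int = 0

    def update(self, predicted, actual) -> None:
        self.seen += 1
        self.correct += int(predicted == actual)

    @property
    def value(self) -> float:
        return self.correct / self.seen if self.seen else 0.0


def select_ensemble(accuracies, confidences, threshold):
    """Return sorted indices of classifiers that are both accurate and confident.

    ASSUMPTIONS (not taken from the abstract):
    - "accurate" = prequential accuracy at or above the mean over all classifiers;
    - "confident" = confidence score strictly above a fixed threshold
      (the paper learns this threshold incrementally; that is not reproduced here).
    """
    mean_acc = sum(accuracies) / len(accuracies)
    accurate = {i for i, a in enumerate(accuracies) if a >= mean_acc}
    confident = {i for i, c in enumerate(confidences) if c > threshold}
    selected = accurate & confident  # intersection of the two candidate sets
    if not selected:
        # Hypothetical fallback: use the single most accurate classifier.
        selected = {max(range(len(accuracies)), key=accuracies.__getitem__)}
    return sorted(selected)


# Toy stream: predictions of three base classifiers, then the true label.
stream = [((1, 1, 0), 1), ((0, 1, 0), 0), ((1, 0, 0), 1), ((1, 1, 1), 1)]
trackers = [PrequentialAccuracy() for _ in range(3)]
for predictions, label in stream:
    for tracker, predicted in zip(trackers, predictions):
        tracker.update(predicted, label)

accuracies = [t.value for t in trackers]
confidences = [0.9, 0.8, 0.4]  # made-up confidence scores for the next instance
print(select_ensemble(accuracies, confidences, threshold=0.6))
```

In this toy run only the first classifier is both accurate and confident, so it alone would predict the next instance; with a real stream the selected subset would change as the prequential accuracies and confidence scores evolve.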

Similar resources

Heterogeneous Ensemble for Feature Drifts in Data Streams

The nature of data streams requires classification algorithms to be real-time, efficient, and able to cope with high-dimensional data that are continuously arriving. It is a known fact that in high-dimensional datasets, not all features are critical for training a classifier. To improve the performance of data stream classification, we propose an algorithm called HEFT-Stream (Heterogeneous Ense...

Online Ensemble Learning for Imbalanced Data Streams

While both cost-sensitive learning and online learning have been studied extensively, efforts to deal with the two issues simultaneously have been limited. Aiming at this challenging task, a novel learning framework is proposed in this paper. The key idea is based on the fusion of online ensemble algorithms and state-of-the-art batch-mode cost-sensitive bagging/boosting algorithms. Within th...

Leveraging Bagging for Evolving Data Streams

Bagging, boosting and Random Forests are classical ensemble methods used to improve the performance of single classifiers. They obtain superior performance by increasing the accuracy and diversity of the single classifiers. Attempts have been made to reproduce these methods in the more challenging context of evolving data streams. In this paper, we propose a new variant of bagging, called lever...

Classifying Evolving Data Streams for Intrusion Detection

Stream data classification is a challenging problem because of two important properties: its infinite length and evolving nature. Traditional learning algorithms that require several passes over the training data are not directly applicable to the stream classification problem because of the infinite length of the data stream. Data streams may evolve in several ways: the prior probability distributio...

Mining evolving data streams for frequent patterns

A data stream is a potentially uninterrupted flow of data. Mining this flow makes it necessary to cope with uncertainty, as only a part of the stream can be stored. In this paper, we evaluate a statistical technique which biases the estimation of the support of patterns, so as to maximize either the precision or the recall, as chosen by the user, and limit the degradation of the other criterion...

Journal

Journal title: Pattern Recognition

Year: 2021

ISSN: 1873-5142, 0031-3203

DOI: https://doi.org/10.1016/j.patcog.2020.107743